
RSAT matrix-clustering

RSAT matrix-clustering (1) is a tool for clustering and aligning transcription factor binding motifs. Here is a brief description of the method:

  • Motif comparison: the motifs are compared to each other using two metrics: the Pearson correlation coefficient (cor) and a version corrected for alignment width, the normalized Pearson correlation (Ncor).

  • Hierarchical clustering: the motifs are hierarchically clustered based on the values of one comparison metric (default = Ncor).

  • Tree partition: the hierarchical tree is partitioned by calculating the average cor and Ncor values at each node. Each time a node does not satisfy the thresholds (one value for cor and another for Ncor), it is split into two clusters.

  • Motif alignment: for each cluster, the motifs are progressively aligned following the linkage order of the hierarchical tree. This ensures that each motif is aligned relative to its most similar motif in the cluster.
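
The two metrics and the threshold check can be illustrated with a minimal sketch. It is not the RSAT implementation: the function names are hypothetical, and the formula Ncor = cor * w / W (w = number of aligned columns, W = total alignment width) is an assumption about how the width correction is applied. The default thresholds are the ones used in the example command of this document.

```javascript
// Pearson correlation between two numeric arrays of equal length.
function pearson(x, y) {
  if (x.length !== y.length || x.length === 0) {
    throw new Error('inputs must be non-empty and of equal length');
  }
  const n = x.length;
  const meanX = x.reduce((sum, v) => sum + v, 0) / n;
  const meanY = y.reduce((sum, v) => sum + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
    syy += (y[i] - meanY) ** 2;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// ASSUMPTION: Ncor = cor * w / W, with w = aligned columns and
// W = w1 + w2 - w = total alignment width (w1, w2 = motif widths).
function normalizedCor(cor, w, w1, w2) {
  return cor * (w / (w1 + w2 - w));
}

// Lower thresholds from the example command in this document:
// -lth cor 0.6 -lth Ncor 0.4
function passesThresholds(cor, ncor, minCor = 0.6, minNcor = 0.4) {
  return cor >= minCor && ncor >= minNcor;
}

// Hypothetical case: a 6-column motif fully aligned inside a 10-column motif
const cor = 0.9;
const ncor = normalizedCor(cor, 6, 6, 10); // 0.9 * 6/10, roughly 0.54
console.log(ncor, passesThresholds(cor, ncor));
```

Under this assumption, a short motif that matches only part of a long motif keeps a high cor but gets a lower Ncor, which is why the two thresholds are set separately.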

We recently updated the algorithm to enable a radial tree visualization. You can find many examples on the JASPAR (2) database website (click on the Radial tree buttons).

In this repository you will find an example that reproduces the JASPAR nematodes radial tree, which contains 43 motifs corresponding to 12 TF classes.

Figure 1. JASPAR nematodes motifs clustered and aligned in a radial tree. The color ring and its number represent different TF classes.


Run RSAT matrix-clustering

Generate a radial tree

This repository assumes that RSAT (3) is already installed on your system. Alternatively, you can run RSAT matrix-clustering on one of the RSAT web servers.

If you run RSAT matrix-clustering from the command line, you can use the following command:


matrix-clustering -v 2                                        \
-matrix jaspar_2022_nematodes JASPAR2022_CORE_nematodes.tf tf \
-title 'JASPAR 2022 nematodes CORE'                           \
-hclust_method average                                        \
-calc sum                                                     \
-metric_build_tree Ncor                                       \
-lth w 5 -lth cor 0.6 -lth Ncor 0.4                           \
-label_in_tree name                                           \
-return json                                                  \
-quick                                                        \
-radial_tree_only                                             \
-o results/JASPAR2022_CORE_nematodes/JASPAR2022_CORE_nematodes

NOTE: when running from the command line, don’t forget to include the parameter -radial_tree_only. On the web server, this option must be activated as shown in Figure 2.

Figure 2. Click on the 'Export Radial Tree' box.

Install apache

The file data/JASPAR_2022_CORE_nematodes_matrix-clustering/matrix-clustering_radial_tree.html contains the radial tree as an HTML file. To view it, you need Apache (apache2) installed and must open the file through localhost; otherwise the HTML content will not be displayed.

To install Apache you can follow these instructions.

Once Apache is installed on your computer:

  1. Remove the default folder: sudo rm -rf /var/www/html
  2. Recreate the folder and copy the RSAT matrix-clustering folder into /var/www/html/
  3. Open your browser and type localhost. Now you can browse the files in /var/www/html/
  4. Search for and open the file matrix-clustering_radial_tree.html

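## Remove the default Apache content and recreate the (now empty) folder
sudo rm -rf /var/www/html
sudo mkdir -p /var/www/html

## Copy the matrix-clustering results into the folder served by Apache
sudo cp -r data/JASPAR_2022_CORE_nematodes_matrix-clustering /var/www/html/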

Figure 3. Through localhost, open the file matrix-clustering_radial_tree.html.

After this step, you will see a plain, unannotated tree like the one in Figure 4.

This example will take ~2 minutes, but keep in mind that the running time varies according to the number of input motifs.

Briefly, this radial tree contains the following elements (from inside to outside):

  • Hierarchical tree
  • The colors in the tree branches correspond to the clusters identified by RSAT matrix-clustering.
  • Motif IDs (or TF names)
  • Motif logo

Figure 4. Radial tree without outer ring and annotations.

The following sections show how to modify the tree in Figure 4 to obtain a nice tree like the one in Figure 1.

We will modify the following features:

  • Adapt space between motif IDs and logos
  • Font size
  • Logo size
  • Add annotations (color ring)
  • Add background (same as color ring)


Annotate radial tree

To annotate the radial tree we need a table (without header) with the following data in this order:

  1. Collection name (the same name provided when running RSAT matrix-clustering with the parameter -matrix. See section Generate a radial tree).

  2. Motif ID (the exact same ID as in your input motif, case sensitive).

  3. Color in hexadecimal code.

  4. Text that will be displayed in the annotation ring. In this example we are using numbers but in principle any text is allowed.


jaspar_2022_nematodes   MA0545_1    #743190 1
jaspar_2022_nematodes   MA1438_1    #C97182 2
jaspar_2022_nematodes   MA1704_1    #C97182 2
jaspar_2022_nematodes   MA0537_1    #DAB877 3
jaspar_2022_nematodes   MA0260_1    #DAB877 3
jaspar_2022_nematodes   MA0543_1    #DAB877 3
jaspar_2022_nematodes   MA0923_1    #DAB877 3
jaspar_2022_nematodes   MA1703_1    #DAB877 3
jaspar_2022_nematodes   MA1450_1    #907499 4
jaspar_2022_nematodes   MA0262_1    #6CA47C 5
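
Before running the annotation script, it may help to check the table for the most common mistakes. The sketch below is only an illustration: the function name is hypothetical, and it assumes the table is tab-separated (as the .tsv extension suggests) and that colors are written as # followed by 3 or 6 hexadecimal digits.

```javascript
// Check an annotation table (tab-separated, no header, 4 columns).
// Returns a list of problems; an empty list means no problem was found.
// knownMotifIds is an optional Set with the motif IDs of the input file.
function checkAnnotationTable(text, knownMotifIds) {
  const problems = [];
  const rows = text.split(/\r?\n/).filter((line) => line.trim() !== '');
  rows.forEach((row, index) => {
    const where = 'row ' + (index + 1);
    const fields = row.split('\t');
    if (fields.length !== 4) {
      problems.push(where + ': expected 4 columns, found ' + fields.length);
      return;
    }
    const [collection, motifId, color] = fields;
    if (collection.trim() === '') {
      problems.push(where + ': empty collection name');
    }
    // Motif IDs are case sensitive, so the comparison is exact
    if (knownMotifIds && !knownMotifIds.has(motifId)) {
      problems.push(where + ': unknown motif ID ' + motifId);
    }
    if (!/^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$/.test(color)) {
      problems.push(where + ': not a hexadecimal color: ' + color);
    }
  });
  return problems;
}

// Usage: the second row has a lower-case ID and a color without '#'
const table = [
  ['jaspar_2022_nematodes', 'MA0545_1', '#743190', '1'].join('\t'),
  ['jaspar_2022_nematodes', 'ma1438_1', 'C97182', '2'].join('\t'),
].join('\n');
const ids = new Set(['MA0545_1', 'MA1438_1']);
console.log(checkAnnotationTable(table, ids)); // two problems, both in row 2
```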


When this table is ready, run the following command.

  • --annotation : the table described above.
  • --input : the folder containing all the RSAT matrix-clustering results.

## Add the annotation + color ring + background
Rscript annotate_matrix-clustering.R                                             \
    --annotation data/Extra/JASPAR_2022_nematodes_CORE_annotations_radial_tree.tsv \
    --input      data/JASPAR_2022_CORE_nematodes_matrix-clustering

After running this script, copy the folder data/JASPAR_2022_CORE_nematodes_matrix-clustering to /var/www/html/ again. This time it will include the HTML file with the annotated tree.

## Copying into /var/www/html/ updates the existing folder instead of nesting a second copy inside it
sudo cp -r data/JASPAR_2022_CORE_nematodes_matrix-clustering /var/www/html/

Figure 5. Through localhost, open the file matrix-clustering_radial_tree_annotated.html.

The radial tree is almost ready (Figure 6). You can see the following differences:

  • Each motif is highlighted with a background color, taken from the annotation table (3rd column).
  • There is a colored ring around the radial tree. Each fragment of the ring corresponds to a motif, and its color is the same as the motif background color.

Figure 6. Radial tree without manual customization.

Manual customization

The next and final step is to manually customize some elements in the tree.

  1. Font size (Figure 7)
  2. Space between motif names and logos (Figure 8)
  3. Logo size (Figure 9)
  4. Align logo with text (Figure 10)
  5. Insert text in annotation ring (Figure 11)
  6. Distance between ring and tree center (Figure 12)
  7. Tree branches thickness (Figure 13)

To make these changes, open the file data/JASPAR_2022_CORE_nematodes_matrix-clustering/matrix-clustering_radial_tree_annotated.html with a text editor and change the parameters indicated in the following sections.

Font size

Figure 7. Change the number in the 'font-size' parameter. Default: 12.

Space between motif names and logos

Figure 8. Modify the 'x' attribute to adapt the distance between the logo and the motif id. Default: 70.

Logo size

Figure 9. Modify the 'height' attribute to adapt the logo size. Default: 20.

Align logo with text

Figure 10. Modify the 'y' attribute to move the logo along the Y-axis and align it with the text. Default: 0.

Add text in color ring

Remove the text between the block of code starting with node.append("image") and the block of code starting with vis.selectAll('g.text'). Then copy and paste the following code in its place.

    // Radii where the background band starts and ends
    var innerRad_start = 205;
    var innerRad_end = 450;

    // Background layer: one translucent arc per motif, filled with d.class
    vis.selectAll('rect_selection')
        .data(data_sample)
        .enter()
        .append('path')
        .attr('class', 'rect1')
        .attr('id', function(d){ return('rect_sel' + d.id_motif) })
        .attr('d', d3.svg.arc()
            .startAngle(function(d){ return( (d.start * Math.PI)/180 ) })  // degrees to radians
            .endAngle(function(d){ return( (d.end * Math.PI)/180 ) })      // degrees to radians
            .innerRadius(innerRad_start)   // where the background starts
            .outerRadius(innerRad_end)     // where the background ends
        )
        .style('fill', function(d){ return(d.class) })
        .attr('transform', 'translate(0,0)')
        .style('stroke-width', '1px')
        .style('opacity', 0.15);

    // Annotation ring: one opaque arc per motif, drawn just outside the background
    vis.selectAll('annotations')
        .data(data_sample)
        .enter()
        .append('path')
        .attr('class', 'annotation1')
        .attr('id', function(d,i){ return('path' + i) })
        .attr('d', d3.svg.arc()
            .startAngle(function(d){ return( d.start * (Math.PI/180) ) })  // degrees to radians
            .endAngle(function(d){ return( d.end * (Math.PI/180) ) })      // degrees to radians
            .innerRadius(innerRad_end)        // the ring starts where the background ends
            .outerRadius(innerRad_end + 30)   // ring width: 30
        )
        .attr('fill', function(d){ return(d.class) })
        .attr('stroke', 'white')
        .attr('transform', 'translate(0,0)')
        .style('stroke-width', '2px')
        .style('opacity', 1);

    // Ring text: one label per motif, written along its ring segment (path0, path1, ...)
    vis.selectAll('annotation1')
        .data(data_sample)
        .enter()
        .append('text')
        .attr('dy', 20)
        .attr('x', 33)
        .style('font-size', '15px')
        .append('textPath')
        .attr('startOffset', '50%')
        .style('text-anchor', 'middle')
        //.attr('stroke','black')
        //.attr('fill','black')
        .attr('xlink:href', function(d,i){ return('#path' + i) })
        .text(function(d){
            // Second-to-last '_'-separated token of the motif name
            var token = d.matrix_name.split('_').slice(-2)[0];
            // Motifs whose token starts with 'UN' get an asterisk after the label
            if (/^UN/.test(token)) {
                return(d.class_nb + '*');
            }
            return(d.class_nb);
        });

Figure 11. Insert the code above in this part of the file.
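
The two pieces of logic inside the pasted code (angle conversion and ring label) can be tried outside the browser with the minimal sketch below. The function names and the sample objects are hypothetical; only the fields matrix_name and class_nb come from the code above.

```javascript
// Degrees to radians, as done for startAngle and endAngle in the pasted code.
function degToRad(degrees) {
  return (degrees * Math.PI) / 180;
}

// Same rule as the .text() callback: take the second-to-last '_'-separated
// token of matrix_name and append '*' to the label when it starts with 'UN'.
function ringLabel(d) {
  const token = d.matrix_name.split('_').slice(-2)[0];
  return /^UN/.test(token) ? d.class_nb + '*' : d.class_nb;
}

// Hypothetical motif entries (the real names depend on your input collection)
const samples = [
  { matrix_name: 'collection_MA0545_1', class_nb: '1' },
  { matrix_name: 'collection_UN0001_1', class_nb: '7' },
];
samples.forEach((d) => console.log(d.matrix_name, '->', ringLabel(d)));
```

With these sample entries, the first label is printed unchanged and the second one gets an asterisk, because its token starts with 'UN'.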

Distance between ring and tree center

Figure 12. Change the variable 'innerRad_end' to adapt where the background color ends.
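
If you regenerate the tree often, this edit could be scripted instead of done by hand. The helper below is a hypothetical sketch, not part of the repository: it works on the file content as a string and assumes the variable is declared exactly once, in the form var innerRad_end = 450; as in the code pasted above.

```javascript
// Replace the numeric value of a declaration such as "var innerRad_end = 450;".
// Only the first matching declaration is changed.
function setNumericVar(source, name, value) {
  const pattern = new RegExp('(var\\s+' + name + '\\s*=\\s*)-?\\d+(\\.\\d+)?');
  if (!pattern.test(source)) {
    throw new Error('variable not found: ' + name);
  }
  // A replacer function avoids '$1' + value being read as another group number
  return source.replace(pattern, (match, prefix) => prefix + value);
}

// Usage on a fragment of the HTML file
const fragment = 'var innerRad_start = 205;\nvar innerRad_end = 450;';
console.log(setNumericVar(fragment, 'innerRad_end', 500));
```

In Node.js, the file content could be read with fs.readFileSync and written back with fs.writeFileSync; keep a copy of the original HTML file before overwriting it.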

Tree branches thickness

Figure 13. In the CSS section, change the parameter 'stroke-width'. Default: 1px.

Radial tree ready

After all the manual changes, you should see a tree like the one in Figure 14. Remember that this file can only be viewed from a folder served by Apache.

Figure 14. After all the manual changes, the radial tree should look like this one.

References

1. Castro-Mondragon,J.A., Jaeger,S., Thieffry,D., Thomas-Chollier,M. and Helden,J. van (2017) RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections. Nucleic Acids Research, 45, e119.
2. Castro-Mondragon,J.A., Riudavets-Puig,R., Rauluseviciute,I., Berhanu Lemma,R., Turchi,L., Blanc-Mathieu,R., Lucas,J., Boddie,P., Khan,A., Manosalva Pérez,N., et al. (2022) JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Research, 50, D165–D173.
3. Santana-Garcia,W., Castro-Mondragon,J.A., Padilla-Gálvez,M., Nguyen,N.T.T., Elizondo-Salas,A., Ksouri,N., Gerbes,F., Thieffry,D., Vincens,P., Contreras-Moreira,B., et al. (2022) RSAT 2022: regulatory sequence analysis tools. Nucleic Acids Research, 50, W670–W676.